Fix pathologically slow assertion diffs for large inputs (#8998) #14543
kirilklein wants to merge 1 commit into
We have an in-flight MR (#14523) that uses a generator in the assert repr, which could help with this when we don't have to show the actual output.
…8998)

Comparing very large strings, lists, or dataclasses in an ``assert`` could hang for a long time (sometimes minutes) while pytest built the failure diff. The cost comes from ``difflib.ndiff``: its character-level "fancy replace" step is quadratic in the size of the differing region, and the underlying ``SequenceMatcher`` is quadratic in the number of lines (a large nested structure can pretty-print to hundreds of thousands of lines).

Add a deterministic size heuristic (no wall-clock timeouts, per the maintainer discussion in the issue): when the input is too large for ``ndiff`` to be fast, fall back to a coarser line-level ``unified_diff``, capped to a bounded number of lines so it always completes in milliseconds, and note this in the output. Smaller comparisons keep the existing detailed ``ndiff`` output unchanged.
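The fallback described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the names `too_slow_for_ndiff` and `diff_lines`, the three threshold values, and the wording of the notices are all assumptions made for the example.

```python
import difflib
from collections.abc import Iterator, Sequence

# Illustrative thresholds only (assumed); the PR says its values came from benchmarking.
MAX_CHARS = 5_000
MAX_LINES = 1_000
MAX_OUTPUT_LINES = 50


def too_slow_for_ndiff(left: Sequence[str], right: Sequence[str]) -> bool:
    """Return True when the line count or the character count is over budget."""
    if len(left) + len(right) > MAX_LINES:
        return True
    chars = sum(len(line) for line in left) + sum(len(line) for line in right)
    return chars > MAX_CHARS


def diff_lines(left: Sequence[str], right: Sequence[str]) -> Iterator[str]:
    """Yield a detailed ndiff for small inputs, else a capped unified diff."""
    if not too_slow_for_ndiff(left, right):
        yield from (line.rstrip("\n") for line in difflib.ndiff(left, right))
        return
    # Hypothetical notice text; the PR's actual wording may differ.
    yield "Input too large for a detailed diff; showing a line-level diff:"
    unified = difflib.unified_diff(left, right, lineterm="", n=1)
    for shown, line in enumerate(unified):
        if shown >= MAX_OUTPUT_LINES:
            yield f"... diff truncated after {MAX_OUTPUT_LINES} lines"
            return
        yield line
```

Small inputs still go through `difflib.ndiff`, so their output is unchanged; only inputs over either budget take the coarser path, and the decision depends solely on input size, not on elapsed time.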
c992d71 to e232573
Thanks! I looked at #14523. It and this PR are complementary:
They do overlap in
Pierre-Sassoulas left a comment
I think this makes sense: ndiff is really costly, and if there are a ton of changes no one is going to look at everything in great detail. Maybe we could make some lines fancy and not show everything, instead of showing all the lines as non-fancy, though. Or make only the first line fancy, because -vvv means "show me the full diff" after all.
    size = sum(len(line) for line in left_lines) + sum(
        len(line) for line in right_lines
    )
We're summing everything here; we need to exit early as soon as size becomes greater than NDIFF_MAX_INPUT_SIZE.
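One way to get the early exit asked for here is to accumulate the total in a loop and return as soon as it passes the budget. The sketch below is only an illustration: the helper name `exceeds_char_budget` is hypothetical, and the value given to NDIFF_MAX_INPUT_SIZE is a placeholder rather than the PR's threshold.

```python
from collections.abc import Iterable
from itertools import chain

NDIFF_MAX_INPUT_SIZE = 5_000  # placeholder value, not the PR's actual threshold


def exceeds_char_budget(left_lines: Iterable[str], right_lines: Iterable[str]) -> bool:
    """Return True as soon as the running character total passes the budget."""
    size = 0
    for line in chain(left_lines, right_lines):
        size += len(line)
        if size > NDIFF_MAX_INPUT_SIZE:
            return True  # stop here; the remaining lines are never inspected
    return False
```

With this shape the work done on an oversized input is bounded by the budget rather than by the input length.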
    yield (
        f"Diff too large to compute in full (over {NDIFF_MAX_INPUT_SIZE} "
        "characters); showing a faster line-level diff instead:"
    )
The message is wrong here: it could be either too many lines or too many chars.
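A possible way to address this is to have the heuristic report which budget was exceeded and build the notice from that. The following is a sketch under assumptions: `fallback_reason`, `fallback_notice`, and both threshold names and values are invented for the example and do not come from the PR.

```python
from typing import Optional

# Placeholder budgets (assumed values).
MAX_CHARS = 5_000
MAX_LINES = 1_000


def fallback_reason(n_lines: int, n_chars: int) -> Optional[str]:
    """Describe which budget was exceeded, or return None if ndiff is fine."""
    if n_lines > MAX_LINES:
        return f"over {MAX_LINES} lines"
    if n_chars > MAX_CHARS:
        return f"over {MAX_CHARS} characters"
    return None


def fallback_notice(reason: str) -> str:
    """Build the user-facing notice from the reason string."""
    return (
        f"Diff too large to compute in full ({reason}); "
        "showing a faster line-level diff instead:"
    )
```

The notice then names the limit that actually triggered the fallback instead of always mentioning characters.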
    left_lines = left.splitlines(keepends)
    right_lines = right.splitlines(keepends)
Do we have to split lines? Can't we just count the line separators?
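Counting separators without building the list could look like the sketch below. The helper name `count_lines_fast` is hypothetical. One caveat worth keeping in mind: `str.splitlines` also breaks on `\r`, form feed, and several Unicode separators, so counting only `\n` is an approximation that can under-count for such input.

```python
def count_lines_fast(text: str) -> int:
    """Approximate the line count by counting newline characters only."""
    if not text:
        return 0
    # A trailing newline terminates the last line rather than starting a new one.
    return text.count("\n") + (0 if text.endswith("\n") else 1)
```

For the size check alone this avoids allocating the line lists; the lines would presumably still need to be split once the code decides to actually compute a diff.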
        assert ndiff_too_slow(["spam"], ["eggs"]) is False

    def test_many_characters_is_too_slow(self) -> None:
        assert ndiff_too_slow(["a" * 6000], ["b" * 6000]) is True
Let's mock the values; we don't have to actually construct an enormous list to test the behavior.
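A sketch of what mocking the thresholds might look like, so that tiny inputs exercise the "too slow" branch. To keep the example self-contained, the limits live on a hypothetical `DiffLimits` class and the body of `ndiff_too_slow` is a stand-in; in the PR the constants are module-level, where pytest's `monkeypatch.setattr` on the module would presumably play the same role.

```python
from unittest import mock


class DiffLimits:
    """Hypothetical holder for the thresholds (placeholder values)."""

    max_chars = 5_000
    max_lines = 1_000


def ndiff_too_slow(left_lines: list[str], right_lines: list[str]) -> bool:
    """Stand-in for the PR's heuristic, reading its limits from DiffLimits."""
    if len(left_lines) + len(right_lines) > DiffLimits.max_lines:
        return True
    size = sum(len(line) for line in left_lines) + sum(len(line) for line in right_lines)
    return size > DiffLimits.max_chars


def test_many_characters_is_too_slow() -> None:
    # Shrink the budget instead of building a 6000-character string.
    with mock.patch.object(DiffLimits, "max_chars", 5):
        assert ndiff_too_slow(["aaaa"], ["bbbb"]) is True
    # Outside the patch the original limit is restored.
    assert ndiff_too_slow(["aaaa"], ["bbbb"]) is False


def test_many_lines_is_too_slow() -> None:
    with mock.patch.object(DiffLimits, "max_lines", 3):
        assert ndiff_too_slow(["a", "b"], ["c", "d"]) is True
```

The tests then describe the behavior at the boundary without depending on the concrete threshold values.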
        assert "- " + "a" * 50 + "eggs" in lines
        assert "+ " + "a" * 50 + "spam" in lines

    def test_text_diff_large_input_skips_ndiff(self) -> None:
Closes #8998.
Problem
Comparing very large strings, lists, or dataclasses inside an assert can hang for a long time (sometimes minutes) while pytest builds the failure diff.

Profiling the reproductions from the issue confirms the root cause is difflib.ndiff: SequenceMatcher is quadratic in the number of lines, and a large nested structure pretty-prints to a huge number of lines (the dataclass example in the issue pformats to ~418,000 lines).

Approach
Following the maintainer discussion in the issue, this uses a deterministic size heuristic rather than wall-clock timeouts (which are non-deterministic and can't reliably interrupt difflib).

A new helper module _pytest/assertion/_diff.py provides:
- ndiff_too_slow(left_lines, right_lines): True when the combined input exceeds a character budget or a line-count budget, the two dimensions that make ndiff slow.
- fast_unified_diff(...): a coarse but fast line-level difflib.unified_diff, capped to a bounded number of lines so it always completes in milliseconds. It notes in the output that a faster diff is being shown (and how many lines were hidden).

Both pathological call sites fall back to it when needed:
- compare_text._diff_text (string comparisons)
- _compare_sequence._compare_eq_iterable (list / dataclass / iterable comparisons)

Comparisons below the cutoffs keep the existing detailed ndiff output unchanged.

Results
On the reproductions from the issue (dataclass with large lists + two large random strings), with -v:
(results table not recovered; the only surviving fragment refers to find_longest_match)

Tests
Added regression tests in testing/test_assertion.py: unit tests for the ndiff_too_slow heuristic, and integration tests that large string / many-line / large-iterable comparisons fall back to the fast diff (no ndiff ? guide lines), still show which lines differ, and emit the line-cap notice. Thresholds were chosen from benchmarking.

🤖 Generated with Claude Code